AITopics | yellow block

Planning remains a core challenge for large language models (LLMs), particularly in domains that require coherent multi-step action sequences grounded in external constraints. We introduce SymPlanner, a novel framework that equips LLMs with structured planning capabilities by interfacing them with a symbolic environment that serves as an explicit world model. Rather than relying purely on natural language reasoning, SymPlanner grounds the planning process in a symbolic state space, where a policy model proposes actions and a symbolic environment deterministically executes and verifies their effects. To enhance exploration and improve robustness, we introduce Iterative Correction (IC), which refines previously proposed actions by leveraging feedback from the symbolic environment to eliminate invalid decisions and guide the model toward valid alternatives. Additionally, Contrastive Ranking (CR) enables fine-grained comparison of candidate plans by evaluating them jointly. Conceptually, SymPlanner operationalizes two cognitive faculties: (i) error monitoring and repair via externalized feedback (IC) and (ii) preference formation among alternatives via pairwise comparison (CR), advancing cognitively plausible, symbol-grounded planning aligned with the rich structure in intelligent systems. We evaluate SymPlanner on PlanBench, demonstrating that it produces more coherent, diverse, and verifiable plans than pure natural language baselines.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2505.01479

Country: North America > United States (0.28)

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities

Zhou, Xueyang, Tie, Guiyao, Zhang, Guowen, Wang, Weidong, Zuo, Zhigang, Wu, Di, Chu, Duanfeng, Zhou, Pan, Sun, Lichao, Gong, Neil Zhenqiang

arXiv.org Artificial IntelligenceMar-14-2025

The rise of Large Reasoning Models (LRMs) signifies a paradigm shift toward advanced computational reasoning. Yet, this progress disrupts traditional agent frameworks, traditionally anchored by execution-oriented Large Language Models (LLMs). To explore this transformation, we propose the LaRMA framework, encompassing nine tasks across Tool Usage, Plan Design, and Problem Solving, assessed with three top LLMs (e.g., Claude3.5-sonnet) and five leading LRMs (e.g., DeepSeek-R1). Our findings address four research questions: LRMs surpass LLMs in reasoning-intensive tasks like Plan Design, leveraging iterative reflection for superior outcomes; LLMs excel in execution-driven tasks such as Tool Usage, prioritizing efficiency; hybrid LLM-LRM configurations, pairing LLMs as actors with LRMs as reflectors, optimize agent performance by blending execution speed with reasoning depth; and LRMs' enhanced reasoning incurs higher computational costs, prolonged processing, and behavioral challenges, including overthinking and fact-ignoring tendencies. This study fosters deeper inquiry into LRMs' balance of deep thinking and overthinking, laying a critical foundation for future agent design advancements.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2503.11074

Country:

North America > United States > Colorado (0.04)
Europe > France (0.04)
North America > United States > New York (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Leisure & Entertainment (0.67)
Consumer Products & Services (0.46)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Complex LLM Planning via Automated Heuristics Discovery

Ling, Hongyi, Parashar, Shubham, Khurana, Sambhav, Olson, Blake, Basu, Anwesha, Sinha, Gaurangi, Tu, Zhengzhong, Caverlee, James, Ji, Shuiwang

arXiv.org Artificial IntelligenceFeb-26-2025

We consider enhancing large language models (LLMs) for complex planning tasks. While existing methods allow LLMs to explore intermediate steps to make plans, they either depend on unreliable self-verification or external verifiers to evaluate these steps, which demand significant data and computations. Here, we propose automated heuristics discovery (AutoHD), a novel approach that enables LLMs to explicitly generate heuristic functions to guide inference-time search, allowing accurate evaluation of intermediate states. These heuristic functions are further refined through a heuristic evolution process, improving their robustness and effectiveness. Our proposed method requires no additional model training or fine-tuning, and the explicit definition of heuristic functions generated by the LLMs provides interpretability and insights into the reasoning process. Extensive experiments across diverse benchmarks demonstrate significant gains over multiple baselines, including nearly twice the accuracy on some datasets, establishing our approach as a reliable and interpretable solution for complex planning tasks.

arxiv preprint arxiv, heuristic function, llm, (15 more...)

arXiv.org Artificial Intelligence

2502.19295

Country:

North America > United States > Texas (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

On the Limit of Language Models as Planning Formalizers

Huang, Cassie, Zhang, Li

arXiv.org Artificial IntelligenceDec-13-2024

Large Language Models have been shown to fail to create executable and verifiable plans in grounded environments. An emerging line of work shows success in using LLM as a formalizer to generate a formal representation (e.g., PDDL) of the planning domain, which can be deterministically solved to find a plan. We systematically evaluate this methodology while bridging some major gaps. While previous work only generates a partial PDDL representation given templated and thus unrealistic environment descriptions, we generate the complete representation given descriptions of various naturalness levels. Among an array of observations critical to improve LLMs' formal planning ability, we note that large enough models can effectively formalize descriptions as PDDL, outperforming those directly generating plans, while being robust to lexical perturbation. As the descriptions become more natural-sounding, we observe a decrease in performance and provide detailed error analysis.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2412.09879

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
North America > United States (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Filters

Collaborating Authors

yellow block

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

efb2072a358cefb75886a315a6fcf880-Supplemental-Conference.pdf

A Appendix Contents

Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought

efb2072a358cefb75886a315a6fcf880-Supplemental-Conference.pdf

A Appendix Contents

30699996ff411d48903c9752b782a5c1-Paper-Conference.pdf

Deliberate Planning in Language Models with Symbolic Representation

Large Reasoning Models in Agent Scenarios: Exploring the Necessity of Reasoning Capabilities

Complex LLM Planning via Automated Heuristics Discovery

On the Limit of Language Models as Planning Formalizers